The Power of Localization for Efficiently Learning Linear Separators with Noise
Authors
Abstract
We introduce a new approach for designing computationally efficient learning algorithms that are tolerant to noise, and demonstrate its effectiveness by designing algorithms with improved noise tolerance guarantees for learning linear separators. We consider both the malicious noise model of Valiant [Valiant 1985; Kearns and Li 1988] and the adversarial label noise model of Kearns, Schapire, and Sellie [1994]. For malicious noise, where the adversary can corrupt both the label and the features, we provide a polynomial-time algorithm for learning linear separators in R^d under isotropic log-concave distributions that can tolerate a nearly information-theoretically optimal noise rate of η = Ω(ε), improving on the Ω(ε³ / log²(d/ε)) noise tolerance of Klivans et al. [2009]...
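To make the malicious noise setting concrete, the following minimal Python sketch (with hypothetical helper names and an arbitrarily chosen adversary) generates data from a target halfspace under an isotropic log-concave marginal and then lets an adversary overwrite roughly an η fraction of examples, features and labels alike. It only illustrates the model and is not taken from the paper.

import numpy as np

def sample_clean(n, d, w_star, rng):
    # Draw n points from an isotropic Gaussian (one example of an isotropic
    # log-concave distribution) and label them with the target halfspace w_star.
    X = rng.standard_normal((n, d))
    y = np.sign(X @ w_star)
    return X, y

def corrupt_maliciously(X, y, w_star, eta, rng):
    # Hand roughly an eta fraction of examples to the adversary, which may
    # overwrite both the features and the label arbitrarily.
    X, y = X.copy(), y.copy()
    mask = rng.random(len(y)) < eta
    # One crude adversary: move the chosen points far from the origin and
    # give them the wrong label with respect to the target.
    X[mask] = 10.0 * rng.standard_normal((int(mask.sum()), X.shape[1]))
    y[mask] = -np.sign(X[mask] @ w_star)
    return X, y

rng = np.random.default_rng(0)
d = 20
w_star = np.eye(d)[0]                          # unknown target separator
X, y = sample_clean(5000, d, w_star, rng)
X_noisy, y_noisy = corrupt_maliciously(X, y, w_star, eta=0.05, rng=rng)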
Similar papers
The Power of Localization for Efficiently Learning Linear Separators with Malicious Noise
In this paper we put forward new techniques for designing efficient algorithms for learning linear separators in the challenging malicious noise model, where an adversary may corrupt both the labels and the feature part of an η fraction of the examples. Our main result is a polynomial-time algorithm for learning linear separators in R^d under the uniform distribution that can handle a noise rate...
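The localization idea named in the title can be illustrated with a highly simplified Python sketch: the learner repeatedly re-fits a halfspace using only the examples that fall inside a shrinking band around its current hypothesis. The paper's algorithm additionally uses a shrinking hinge-loss scale, a constrained search region, and soft outlier removal against malicious noise; all of that is omitted below, and every constant is an arbitrary placeholder.

import numpy as np

def hinge_fit(X, y, w0, steps=200, lr=0.01):
    # Crude hinge-loss minimization by gradient descent, renormalizing w to
    # the unit sphere after each step.
    w = w0.copy()
    for _ in range(steps):
        margins = y * (X @ w)
        active = margins < 1.0
        if active.any():
            grad = -(y[active, None] * X[active]).mean(axis=0)
            w = w - lr * grad
        w = w / np.linalg.norm(w)
    return w

def localized_learner(X, y, rounds=8, band0=1.0):
    d = X.shape[1]
    w = hinge_fit(X, y, np.ones(d) / np.sqrt(d))   # rough initial hypothesis
    band = band0
    for _ in range(rounds):
        near = np.abs(X @ w) <= band               # localize to a margin band
        if near.sum() > 50:                        # need enough points to refit
            w = hinge_fit(X[near], y[near], w)
        band /= 2.0                                # shrink the band each round
    return w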
Revisiting Perceptron: Efficient and Label-Optimal Learning of Halfspaces
It has been a long-standing problem to efficiently learn a linear separator using as few labels as possible. In this work, we propose an efficient perceptron-based algorithm for actively learning homogeneous linear separators under the uniform distribution. Under bounded noise, where each label is flipped with probability at most η, our algorithm achieves a near-optimal label complexity of Õ(...)...
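The general flavor of such perceptron-based active learning can be sketched as follows: labels are queried only for points that land near the current decision boundary, and mistakes trigger perceptron-style updates. The Python sketch below illustrates that idea under assumed helper names (sample_x, query_label); it is not the algorithm of the paper, whose update rule and band schedule are more refined.

import numpy as np

def active_perceptron(sample_x, query_label, d, n_queries=500, band=0.1):
    # sample_x() returns an unlabeled point; query_label(x) returns its label.
    w = np.zeros(d)
    w[0] = 1.0
    used = 0
    while used < n_queries:
        x = sample_x()
        if abs(x @ w) > band:              # far from the boundary: skip, no label spent
            continue
        y = query_label(x)                 # spend one label query
        used += 1
        if y * (x @ w) <= 0:               # mistake: perceptron-style update
            w = w + y * x
            w = w / np.linalg.norm(w)
    return w

# Example usage with a noiseless oracle and a hypothetical target w_star:
rng = np.random.default_rng(1)
d = 10
w_star = np.eye(d)[0]
w_hat = active_perceptron(lambda: rng.standard_normal(d),
                          lambda x: float(np.sign(x @ w_star)), d)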
An Effective Approach for Robust Metric Learning in the Presence of Label Noise
Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...
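As a small illustration of this dependence on the metric, the Python sketch below implements a kNN classifier parameterized by a Mahalanobis matrix M. Here M is fixed by hand purely for illustration; a metric-learning method would instead estimate M from (possibly noisily labeled) data.

import numpy as np
from collections import Counter

def mahalanobis_knn_predict(x, X_train, y_train, M, k=5):
    # Distance of x to every training point under the metric (u - v)^T M (u - v).
    diffs = X_train - x
    dists = np.einsum('ij,jk,ik->i', diffs, M, diffs)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]   # majority vote

rng = np.random.default_rng(2)
X_train = rng.standard_normal((200, 3))
y_train = (X_train[:, 0] > 0).astype(int)
M = np.diag([5.0, 1.0, 1.0])     # a hand-chosen metric emphasizing feature 0
print(mahalanobis_knn_predict(np.array([0.3, -1.0, 2.0]), X_train, y_train, M))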
Efficient Learning of Linear Separators under Bounded Noise
We study the learnability of linear separators in R^d in the presence of bounded (a.k.a. Massart) noise. This is a realistic generalization of the random classification noise model, where the adversary can flip each example x with probability η(x) ≤ η. We provide the first polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the unif...
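A minimal Python sketch of the bounded (Massart) noise model, under illustrative assumptions: each label may be flipped with some example-dependent probability η(x) that never exceeds a bound η; the particular choice of η(x) below is arbitrary.

import numpy as np

def massart_labels(X, w_star, eta, rng):
    clean = np.sign(X @ w_star)
    # Example flip probability: larger near the boundary, always <= eta.
    flip_prob = eta * np.exp(-np.abs(X @ w_star))
    flips = rng.random(len(clean)) < flip_prob
    return np.where(flips, -clean, clean)

rng = np.random.default_rng(3)
d = 15
w_star = np.eye(d)[0]
X = rng.standard_normal((1000, d))      # stand-in for a uniform or log-concave marginal
y = massart_labels(X, w_star, eta=0.3, rng=rng)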
Active Learning Models and Noise
I study active learning in general pool-based active learning models as well as noisy active learning algorithms, and then compare them for the class of linear separators under the uniform distribution.